Automated Fault Recovery Planning in Cloud Computing

Author

  • Pavlo Kerestey
Abstract

This work investigates the applicability of automated planning approaches to fault management in cloud computing at the infrastructure-as-a-service level. A decision support solution for fault management is examined to determine whether fault recovery can be automated in large-scale cloud computing deployments. Cloud computing is a fairly new topic of growing industrial interest. Cloud computing services are popular because of their flexible resource allocation and economical resource usage, which help avoid under- and over-utilization of computing resources and make planning and management less cost-intensive. At present, no good cloud computing management solution for fault recovery exists, which makes cloud computing services unattractive to many potential users. Because faults occur in every system, a cloud service provider must be able to guarantee that the terms of provisioning will not be breached even when faults happen. This can be achieved by automating error-prone and time-consuming tasks. The aim of the fault recovery solution examined in this work is therefore to minimize the time needed for complete service recovery. To address this problem, an automated planning approach from the field of artificial intelligence is chosen. The work also draws on operations research studies. The goal is to create a prototype of a decision support solution that reduces the complexity of fault recovery and the expense of fault management as a whole. A system and its services should recover from different kinds of faults through fast, systematic composition of recovery plans. A scenario will be created in cooperation with the internet and computing provider Global Access GmbH and the cloud computing provider Zimory GmbH to demonstrate the usefulness of the solution. The overall aim is a machine-aided improvement of IT service availability.
This work explores existing approaches to automated planning and draws on planning applications in grid computing. It analyzes the applicability of automated planning approaches to fault management in cloud computing. An automated planning algorithm is examined, and a prototype is implemented for a scenario to show that the planning system functions as intended.
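
To make the idea of composing a recovery plan concrete, the following is a minimal sketch of automated planning as breadth-first forward search over STRIPS-style actions. It is not the algorithm or prototype described in the abstract. The fact names, the action names, and the two-host scenario are hypothetical and chosen only for illustration. Plan length in steps is used here as a simple stand-in for recovery time.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional


class Action(NamedTuple):
    """A STRIPS-style action over a state made of string facts (hypothetical model)."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    del_effects: FrozenSet[str]


def plan(initial: Iterable[str], goal: Iterable[str],
         actions: List[Action]) -> Optional[List[str]]:
    """Return a shortest sequence of action names reaching the goal, or None."""
    start = frozenset(initial)
    goal_facts = frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal_facts <= state:  # every goal fact holds in this state
            return steps
        for action in actions:
            if action.preconditions <= state:  # action is applicable
                successor = (state - action.del_effects) | action.add_effects
                if successor not in seen:
                    seen.add(successor)
                    queue.append((successor, steps + [action.name]))
    return None  # goal unreachable with the given actions


# Hypothetical scenario: host A has failed and its VM must be restarted on host B.
RECOVERY_ACTIONS = [
    Action("repair_host_a", frozenset({"host_a_failed"}),
           frozenset({"host_a_up"}), frozenset({"host_a_failed"})),
    Action("migrate_vm_to_b", frozenset({"host_b_up", "vm_stopped_on_a"}),
           frozenset({"vm_stopped_on_b"}), frozenset({"vm_stopped_on_a"})),
    Action("start_vm_on_b", frozenset({"host_b_up", "vm_stopped_on_b"}),
           frozenset({"vm_running_on_b"}), frozenset({"vm_stopped_on_b"})),
    Action("start_service", frozenset({"vm_running_on_b"}),
           frozenset({"service_up"}), frozenset()),
]
FAULT_STATE = {"host_a_failed", "host_b_up", "vm_stopped_on_a"}

recovery_plan = plan(FAULT_STATE, {"service_up"}, RECOVERY_ACTIONS)
```

In this sketch, `recovery_plan` lists the actions an operator or orchestration layer would execute in order. A realistic planner would also need action durations or costs, and a model of which faults can occur, neither of which is specified in the abstract.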

Similar resources

A Genetic Based Resource Management Algorithm Considering Energy Efficiency in Cloud Computing Systems

Cloud computing is a result of the continuing progress made in the areas of hardware, Internet-related technologies, distributed computing and automated management. The increasing demand has led to an increase in services, resulting in the establishment of large-scale computing and data centers, in addition to high operating costs and huge amounts of electrical power consumption. Insuffic...

Improving the palbimm scheduling algorithm for fault tolerance in cloud computing

Cloud computing is the latest technology that involves distributed computation over the Internet. It meets the needs of users through sharing resources and using virtual technology. The workflow user applications refer to a set of tasks to be processed within the cloud environment. Scheduling algorithms have a lot to do with the efficiency of cloud computing environments through selection of su...

Real-Time Building Information Modeling (BIM) Synchronization Using Radio Frequency Identification Technology and Cloud Computing System

Online observation of a construction site and its processes offers significant advantages to all business sectors. BIM is the combination of a 3D model of the project and a project-planning program which improves the project planning model by up to 6D (adding time, cost and material information dimensions to the model). RFID technology is an appropriate information synchronization tool between the...

An Architecture for Supporting Network Fault Recovery Management

Highly available and resilient networks play a decisive role in today’s networked world. As network faults are inevitable and networks are becoming increasingly intricate, finding effective fault recovery solutions in a timely manner is becoming a challenging task for administrators. Therefore, an automated mechanism to support fault resolution is essential towards efficient fault handling proces...

Task Scheduling Algorithm Using Covariance Matrix Adaptation Evolution Strategy (CMA-ES) in Cloud Computing

Cloud computing is considered a computational model that provides users with resources on demand. The need to schedule users' jobs has emerged as an important challenge in the field of cloud computing. This is mainly due to several reasons, including ever-increasing advancements in information technology and an increase of applications and...


Publication date: 2010